knowledge component
Pattern-based Knowledge Component Extraction from Student Code Using Representation Learning
Hoq, Muntasir, Pitts, Griffin, Lan, Andrew, Brusilovsky, Peter, Akram, Bita
Effective personalized learning in computer science education depends on accurately modeling what students know and what they need to learn. While Knowledge Components (KCs) provide a foundation for such modeling, automated KC extraction from student code is inherently challenging due to insufficient explainability of discovered KCs and the open-endedness of programming problems with significant structural variability across student solutions and complex interactions among programming concepts. In this work, we propose a novel, explainable framework for automated KC discovery through pattern-based KCs: recurring structural patterns within student code that capture the specific programming patterns and language constructs that students must master. Toward this, we train a Variational Autoencoder to generate important representative patterns from student code guided by an explainable, attention-based code representation model that identifies important correct and incorrect pattern implementations from student code. These patterns are then clustered to form pattern-based KCs. We evaluate our KCs using two well-established methods informed by Cognitive Science: learning curve analysis and Deep Knowledge Tracing (DKT). Experimental results demonstrate meaningful learning trajectories and significant improvements in DKT predictive performance over traditional KT methods. This work advances knowledge modeling in CS education by providing an automated, scalable, and explainable framework for identifying granular code patterns and algorithmic constructs, essential for student learning.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (10 more...)
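The learning curve analysis mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common power-law-of-practice form, error = a * opportunity^(-b), fitted by least squares in log-log space. The abstract does not state the authors' exact procedure, so the function name `fit_power_law` and the toy error rates are hypothetical.

```python
import math

def fit_power_law(error_rates):
    """Fit error = a * t**(-b) to per-opportunity error rates (t = 1, 2, ...).

    Assumes at least two opportunities and every error rate > 0,
    because the fit is ordinary least squares on log(t) vs. log(error).
    Returns (a, b); a larger b means errors fall faster with practice.
    """
    xs = [math.log(t) for t in range(1, len(error_rates) + 1)]
    ys = [math.log(e) for e in error_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Hypothetical error rates for one KC over six practice opportunities.
toy_error_rates = [0.60, 0.45, 0.40, 0.33, 0.31, 0.28]
a, b = fit_power_law(toy_error_rates)
print(f"a={a:.3f}, b={b:.3f}")
```

Under this assumption, a KC whose fitted b is clearly positive shows the declining error rate that learning curve analysis treats as evidence of a coherent skill.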
DiaCDM: Cognitive Diagnosis in Teacher-Student Dialogues using the Initiation-Response-Evaluation Framework
Jia, Rui, Wei, Yuang, Li, Ruijia, Jiang, Yuan-Hao, Xie, Xinyu, Shen, Yaomin, Zhang, Min, Jiang, Bo
While cognitive diagnosis (CD) effectively assesses students' knowledge mastery from structured test data, applying it to real-world teacher-student dialogues presents two fundamental challenges. Traditional CD models lack a suitable framework for handling dynamic, unstructured dialogues, and it's difficult to accurately extract diagnostic semantics from lengthy dialogues. To overcome these hurdles, we propose DiaCDM, an innovative model. We've adapted the initiation-response-evaluation (IRE) framework from educational theory to design a diagnostic framework tailored for dialogue. We also developed a unique graph-based encoding method that integrates teacher questions with relevant knowledge components to capture key information more precisely. To our knowledge, this is the first exploration of cognitive diagnosis in a dialogue setting. Experiments on three real-world dialogue datasets confirm that DiaCDM not only significantly improves diagnostic accuracy but also enhances the results' interpretability, providing teachers with a powerful tool for assessing students' cognitive states. The code is available at https://github.com/Mind-Lab-ECNU/DiaCDM/tree/main.
- North America > United States (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Jiangxi Province > Nanchang (0.04)
The Discovery Engine: A Framework for AI-Driven Synthesis and Navigation of Scientific Knowledge Landscapes
Baulin, Vladimir, Cook, Austin, Friedman, Daniel, Lumiruusu, Janna, Pashea, Andrew, Rahman, Shagor, Waldeck, Benedikt
The prevailing model for disseminating scientific knowledge relies on individual publications dispersed across numerous journals and archives. This legacy system is ill-suited to the recent exponential proliferation of publications, contributing to insurmountable information overload and to issues surrounding reproducibility and retractions. We introduce the Discovery Engine, a framework to address these challenges by transforming an array of disconnected literature into a unified, computationally tractable representation of a scientific domain. Central to our approach is the LLM-driven distillation of publications into structured "knowledge artifacts," instances of a universal conceptual schema, complete with verifiable links to source evidence. These artifacts are then encoded into a high-dimensional Conceptual Tensor. This tensor serves as the primary, compressed representation of the synthesized field, where its labeled modes index scientific components (concepts, methods, parameters, relations) and its entries quantify their interdependencies. The Discovery Engine allows dynamic "unrolling" of this tensor into human-interpretable views, such as explicit knowledge graphs (the CNM graph) or semantic vector spaces, for targeted exploration. Crucially, AI agents operate directly on the graph using abstract mathematical and learned operations to navigate the knowledge landscape, identify non-obvious connections, pinpoint gaps, and assist researchers in generating novel knowledge artifacts (hypotheses, designs). By converting literature into a structured tensor and enabling agent-based interaction with this compact representation, the Discovery Engine offers a new paradigm for AI-augmented scientific inquiry and accelerated discovery.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
Small but Significant: On the Promise of Small Language Models for Accessible AIED
Wei, Yumou, Carvalho, Paulo, Stamper, John
GPT has become nearly synonymous with large language models (LLMs), an increasingly popular term in AIED proceedings. A simple keyword-based search reveals that 61% of the 76 long and short papers presented at AIED 2024 describe novel solutions using LLMs to address some of the long-standing challenges in education, and 43% specifically mention GPT. Although LLMs pioneered by GPT create exciting opportunities to strengthen the impact of AI on education, we argue that the field's predominant focus on GPT and other resource-intensive LLMs (with more than 10B parameters) risks neglecting the potential impact that small language models (SLMs) can make in providing resource-constrained institutions with equitable and affordable access to high-quality AI tools. Supported by positive results on knowledge component (KC) discovery, a critical challenge in AIED, we demonstrate that SLMs such as Phi-2 can produce an effective solution without elaborate prompting strategies. Hence, we call for more attention to developing SLM-based AIED approaches.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- South America > Brazil > Pernambuco > Recife (0.04)
- (3 more...)
- Instructional Material > Course Syllabus & Notes (0.46)
- Research Report > Promising Solution (0.34)
- Education > Educational Setting (0.94)
- Education > Educational Technology > Educational Software > Computer Based Training (0.47)
- Education > Assessment & Standards > Student Performance (0.46)
Can Large Language Models Match Tutoring System Adaptivity? A Benchmarking Study
Borchers, Conrad, Shou, Tianze
Large Language Models (LLMs) hold promise as dynamic instructional aids. Yet, it remains unclear whether LLMs can replicate the adaptivity of intelligent tutoring systems (ITS)--where student knowledge and pedagogical strategies are explicitly modeled. We propose a prompt variation framework to assess LLM-generated instructional moves' adaptivity and pedagogical soundness across 75 real-world tutoring scenarios from an ITS. We systematically remove key context components (e.g., student errors and knowledge components) from prompts to create variations of each scenario. Three representative LLMs (Llama3-8B, Llama3-70B, and GPT-4o) generate 1,350 instructional moves. We use text embeddings and randomization tests to measure how the omission of each context feature impacts the LLMs' outputs (adaptivity) and a validated tutor-training classifier to evaluate response quality (pedagogical soundness). Surprisingly, even the best-performing model only marginally mimics the adaptivity of ITS. Specifically, Llama3-70B demonstrates statistically significant adaptivity to student errors. Although Llama3-8B's recommendations receive higher pedagogical soundness scores than the other models, it struggles with instruction-following behaviors, including output formatting. By contrast, GPT-4o reliably adheres to instructions but tends to provide overly direct feedback that diverges from effective tutoring, prompting learners with open-ended questions to gauge knowledge. Given these results, we discuss how current LLM-based tutoring is unlikely to produce learning benefits rivaling known-to-be-effective ITS tutoring. Through our open-source benchmarking code, we contribute a reproducible method for evaluating LLMs' instructional adaptivity and fidelity.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
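The combination of text embeddings and randomization tests described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the test statistic (cosine distance between group centroids), the function names, and the toy two-dimensional vectors are all hypothetical, and the study's actual statistic and embedding model are not given in the abstract.

```python
import math
import random

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def randomization_test(group_a, group_b, n_permutations=2000, seed=0):
    """Permutation test: do the two embedding groups differ more than chance?

    Returns (observed statistic, p-value). Group labels are shuffled
    repeatedly to build the null distribution of the statistic.
    """
    rng = random.Random(seed)
    observed = cosine_distance(centroid(group_a), centroid(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = cosine_distance(centroid(pooled[:n_a]), centroid(pooled[n_a:]))
        if stat >= observed:
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (extreme + 1) / (n_permutations + 1)

# Toy stand-ins for embeddings of outputs from full vs. ablated prompts.
full_prompt = [[1.0, 0.01 * i] for i in range(8)]
ablated_prompt = [[0.01 * i, 1.0] for i in range(8)]
observed, p_value = randomization_test(full_prompt, ablated_prompt)
print(f"observed={observed:.3f}, p={p_value:.4f}")
```

In this framing, a small p-value would suggest that removing a context component changes the model's outputs, which is one way to read "adaptivity" to that component.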
Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning
Fang, Ying, He, Bo, Liu, Zhi, Liu, Sannyuya, Yan, Zhonghua, Sun, Jianwen
Xiaomai is an intelligent tutoring system (ITS) designed to help Chinese college students in learning advanced mathematics and preparing for the graduate school math entrance exam. This study investigates two distinctive features within Xiaomai: the incorporation of free-response questions with automatic feedback and the metacognitive element of reflecting on self-made errors. An experiment was conducted to evaluate the impact of these features on mathematics learning. One hundred and twenty college students were recruited and randomly assigned to four conditions: (1) multiple-choice questions without reflection, (2) multiple-choice questions with reflection, (3) free-response questions without reflection, and (4) free-response questions with reflection. Students in the multiple-choice conditions demonstrated better practice performance and learning outcomes compared to their counterparts in the free-response conditions. Additionally, the incorporation of error reflection did not yield a significant impact on students' practice performance or learning outcomes. These findings indicate that the current design of free-response questions and the metacognitive feature of error reflection do not enhance the efficacy of the math ITS. This study highlights the need for redesign or enhancement of Xiaomai to optimize its effectiveness in facilitating advanced mathematics learning.
- Asia > China > Hubei Province > Wuhan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Bergen County > Mahwah (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > Strength High (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (0.88)
- Information Technology > Artificial Intelligence > Cognitive Science (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Understanding (0.61)
Using Large Multimodal Models to Extract Knowledge Components for Knowledge Tracing from Multimedia Question Information
Moon, Hyeongdon, Davis, Richard, Neshaei, Seyed Parsa, Dillenbourg, Pierre
Knowledge tracing models have enabled a range of intelligent tutoring systems to provide feedback to students. However, existing methods for knowledge tracing in learning sciences are predominantly reliant on statistical data and instructor-defined knowledge components, making it challenging to integrate AI-generated educational content with traditional established methods. We propose a method for automatically extracting knowledge components from educational content using instruction-tuned large multimodal models. We validate this approach by comprehensively evaluating it against knowledge tracing benchmarks in five domains. Our results indicate that the automatically extracted knowledge components can effectively replace human-tagged labels, offering a promising direction for enhancing intelligent tutoring systems in limited-data scenarios, achieving more explainable assessments in educational settings, and laying the groundwork for automated assessment.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
- Europe > Switzerland (0.04)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- (3 more...)
- Research Report > New Finding (0.88)
- Instructional Material > Course Syllabus & Notes (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Enhancing Explainability of Knowledge Learning Paths: Causal Knowledge Networks
Wei, Yuang, Zhou, Yizhou, Jiang, Yuan-Hao, Jiang, Bo
A reliable knowledge structure is a prerequisite for building effective adaptive learning systems and intelligent tutoring systems. Pursuing an explainable and trustworthy knowledge structure, we propose a method for constructing causal knowledge networks. This approach leverages Bayesian networks as a foundation and incorporates causal relationship analysis to derive a causal network. Additionally, we introduce a dependable knowledge-learning path recommendation technique built upon this framework, improving teaching and learning quality while maintaining transparency in the decision-making process.
- Asia > China > Shanghai > Shanghai (0.05)
- Asia > Singapore (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (2 more...)
- Research Report (1.00)
- Instructional Material (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
- Education > Educational Setting (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Private Knowledge Sharing in Distributed Learning: A Survey
Supeksala, Yasas, Nguyen, Dinh C., Ding, Ming, Ranbaduge, Thilina, Chua, Calson, Zhang, Jun, Li, Jun, Poor, H. Vincent
The rise of Artificial Intelligence (AI) has revolutionized numerous industries and transformed the way society operates. Its widespread use has led to the distribution of AI and its underlying data across many intelligent systems. In this light, it is crucial to utilize information in learning processes that are either distributed or owned by different entities. As a result, modern data-driven services have been developed to integrate distributed knowledge entities into their outcomes. In line with this goal, the latest AI models are frequently trained in a decentralized manner. Distributed learning involves multiple entities working together to make collective predictions and decisions. However, this collaboration can also bring about security vulnerabilities and challenges. This paper provides an in-depth survey on private knowledge sharing in distributed learning, examining various knowledge components utilized in leading distributed learning architectures. Our analysis sheds light on the most critical vulnerabilities that may arise when using these components in a distributed setting. We further identify and examine defensive strategies for preserving the privacy of these knowledge components and preventing malicious parties from manipulating or accessing the knowledge information. Finally, we highlight several key limitations of knowledge sharing in distributed learning and explore potential avenues for future research.
- Oceania > Australia > Victoria > Melbourne (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- (3 more...)
- Research Report (1.00)
- Overview (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Education (0.92)
A Toolbox for Modelling Engagement with Educational Videos
Qiu, Yuxiang, Djemili, Karim, Elezi, Denis, Shalman, Aaneel, Pérez-Ortiz, María, Yilmaz, Emine, Shawe-Taylor, John, Bulathwela, Sahan
With the advancement and utility of Artificial Intelligence (AI), personalising education to a global population could be a cornerstone of new educational systems in the future. This work presents the PEEKC dataset and the TrueLearn Python library, which contains a dataset and a series of online learner state models that are essential to facilitate research on learner engagement modelling. The TrueLearn family of models was designed following the "open learner" concept, using humanly-intuitive user representations. This family of scalable, online models also helps end-users visualise the learner models, which may in the future facilitate user interaction with their models/recommenders. The extensive documentation and coding examples make the library highly accessible to both machine learning developers and educational data mining and learning analytics practitioners. The experiments show the utility of both the dataset and the library with predictive performance significantly exceeding comparative baseline models. The dataset contains a large number of AI-related educational videos, which are of interest for building and validating AI-specific educational recommenders.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Poland > Masovia Province > Warsaw (0.04)
- North America > United States > Maryland > Howard County > Hanover (0.04)
- Europe > United Kingdom (0.04)
- Instructional Material > Course Syllabus & Notes (0.69)
- Research Report > Experimental Study (0.68)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
- Education > Educational Setting > Online (1.00)